DNA methylation patterns reflect individual's lifestyle independent of obesity

Abstract
Objective: Obesity is driven by modifiable lifestyle factors whose effects may be mediated by epigenetics. Therefore, we investigated lifestyle effects on blood DNA methylation in participants of the LIFE-Adult study, a well-characterised population-based cohort from Germany.
Research design and methods: Lifestyle scores (LS) based on diet, physical activity, smoking and alcohol intake were calculated in 4107 participants of the LIFE-Adult study. Fifty subjects with an extremely healthy lifestyle and 50 with an extremely unhealthy lifestyle (5th and 95th percentiles LS) were selected for genome-wide DNA methylation analysis in blood samples employing Illumina Infinium® Methylation EPIC BeadChip system technology.
Results: Differences in DNA methylation patterns between body mass index groups (<25 vs. >30 kg/m2) were rather marginal compared to inter-lifestyle differences (0 vs. 145 differentially methylated positions [DMPs]), which identified 4682 differentially methylated regions (DMRs; false discovery rate [FDR] <5%) annotated to 4426 unique genes. A DMR annotated to the glutamine-fructose-6-phosphate transaminase 2 (GFPT2) locus showed the strongest hypomethylation (∼6.9%), and one annotated to glutamate rich 1 (ERICH1) showed the strongest hypermethylation (∼5.4%) in healthy compared to unhealthy lifestyle individuals. Intersection analysis showed that diet, physical activity, smoking and alcohol intake equally contributed to the observed differences, which affected, among others, pathways related to glutamatergic synapses (adj. p < .01) and axon guidance (adj. p < .05). We showed that methylation age correlates with chronological age and waist-to-hip ratio, with lower DNA methylation age (DNAmAge) acceleration distances in participants with healthy lifestyles. Finally, two identified top DMPs for the alanyl aminopeptidase (ANPEP) locus also showed the strongest expression quantitative trait methylation in blood.
Conclusions: DNA methylation patterns help discriminate individuals with a healthy versus unhealthy lifestyle, which may mask subtle methylation differences derived from obesity.



KEYWORDS
alcohol, diet, DNA methylation, epigenetic clock, epigenetics, lifestyle score, physical activity, smoking

OBJECTIVE
Obesity is well recognised as a multifactorial disease in most modern societies: not only individuals' genetic background contributes to the disease burden, but lifestyle and environment also play a crucial role by strongly influencing epigenetic mechanisms that control metabolic processes. 1 However, common lifestyle intervention regimes vary greatly in structure and length and, thus, in their individual success on weight reduction (reviewed in Aronica et al. 2 ). Observed direct effects, for example, on DNA methylation patterns after short-term lifestyle interventions, are often marginal, 2 which might be due to their short duration and low intensity. 3 Recent studies have demonstrated that successful short-term weight loss interventions may reduce methylation age (mAge) to the chronological age level. 4 Furthermore, DNA methylation patterns may predict the success of lifestyle-induced weight loss. 5-7 Comprehensive studies investigating the underlying interaction between genetics, epigenetics and especially lifestyle are currently lacking. Therefore, (1) we analysed and compared the human blood DNA methylation patterns between subjects living a healthy lifestyle and those living an unhealthy lifestyle. (2) We further compared obese and nonobese subjects to identify DNA methylation patterns that are related to an obese phenotype despite a healthy lifestyle or potentially associated with a healthy (lean) phenotype in an allegedly unhealthy obesogenic environment. (3) We elucidated lifestyle-specific effects on the epigenetic clock. (4) Finally, we investigated the role of genetic variants cis to the identified target regions by methylation quantitative trait loci (meQTL) analyses and addressed the potential consequences of these changes on the blood transcriptome by expression quantitative trait methylation (eQTM) analyses.

Study population
The present analyses included participants of the LIFE-Adult study, a population-based cohort of European ancestry, focusing on lifestyle diseases. 8,9 The cohort comprised ∼10 000 adult subjects aged from 18 to 80 years (mean ± standard deviation [SD]: age = 57.4 ± 12.5 years, body mass index [BMI] = 27.3 ± 4.9 kg/m2) from the region of Leipzig, Germany. All participants underwent extensive phenotyping, including anthropometric measurements, social and lifestyle-behaviour questionnaires and blood parameters. For most subjects, ethylenediaminetetraacetic acid (EDTA) blood samples are available. 8 All participants gave written informed consent to participate in the

Lifestyle score
We created a lifestyle score (LS) as the sum of four different subscores: diet, physical activity (PA), alcohol consumption and smoking. 10 To calculate the scores, we included data from four self-reported questionnaires: (1) a German version of the Food Frequency Questionnaire, 11 (2) the Short-Form International Physical Activity Questionnaire, 12 (3) a questionnaire about smoking status and quantity, and (4) a questionnaire about daily alcohol consumption and frequency. The final LS ranged from 3 to 66 (mean ± SD: 27.19 ± 11.2), with low and high LS values representing a healthy and an unhealthy lifestyle, respectively. All subscores showed an inter-item correlation, as demonstrated by Cronbach's alpha statistic (α = 0.64). A detailed description of the individual scoring and an explanation of each subscore can be found in Supporting  Information and Table S1. Subjects with any missing questionnaire items were completely excluded from further analyses to avoid potential effects caused by general noncompliance of those subjects. Similarly, participants with pre-existing diabetic conditions (HbA1c ≥ 6.5%) 13 or missing BMI measures were also excluded from subsequent analyses. A total of 4107 subjects passed all criteria (Table 1).
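The score construction and the reliability check described above can be illustrated with a minimal sketch. It assumes that the four subscores have already been computed per participant (the actual scoring rules are given in the Supporting Information and are not reproduced here); the function names and the toy values are hypothetical.

```python
from statistics import variance


def lifestyle_score(diet, activity, smoking, alcohol):
    """Total LS as the plain sum of the four subscores."""
    return diet + activity + smoking + alcohol


def cronbach_alpha(items):
    """Cronbach's alpha for k items.

    items: list of k lists, each holding one subscore per participant
    (same participant order in every list).
    """
    k = len(items)
    n = len(items[0])
    # Total score per participant across all items
    totals = [sum(item[i] for item in items) for i in range(n)]
    sum_item_var = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_var / variance(totals))


# Toy data (hypothetical): four subscores for five participants
diet = [2, 5, 9, 12, 15]
activity = [1, 4, 6, 10, 14]
smoking = [0, 3, 8, 9, 16]
alcohol = [1, 2, 7, 11, 13]

scores = [lifestyle_score(*row) for row in zip(diet, activity, smoking, alcohol)]
alpha = cronbach_alpha([diet, activity, smoking, alcohol])
```

Sample variances are used throughout, which is the usual convention for Cronbach's alpha; a higher total corresponds to a less healthy lifestyle, as in the text.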

Subset for genome-wide methylation and validation measurements
Based on the LS calculation, we stratified the cohort into two groups reporting the healthiest and unhealthiest lifestyles by selecting the lowest and highest 5% (5th percentile LS ≤ 11; 95th percentile LS ≥ 48). Within these groups (N = 234), we found 140 subjects with and 94 subjects without obesity according to BMI criteria. 14 Based on this and an equal age range (Figure S1B), we further selected 25 subjects without (BMI < 25 kg/m2) and 25 subjects with obesity (BMI > 30 kg/m2) among each subgroup (5th vs. 95th percentile) (total N = 100) for the genome-wide methylation discovery cohort and included all (with and without obesity) subjects with sufficient available DNA (N = 213) for validation analysis. Thus, both groups overlap in N = 100 samples and are therefore not independent.
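As a rough illustration of this stratification, the following sketch applies the reported cut-offs (LS ≤ 11, LS ≥ 48, BMI < 25 and > 30 kg/m2) to a small table of participants. The participant identifiers, the record layout and the function names are hypothetical, and the age matching mentioned above is not modelled.

```python
def lifestyle_group(ls, low_cut=11, high_cut=48):
    """Return 'healthy' / 'unhealthy' for the extreme tails, else None."""
    if ls <= low_cut:
        return "healthy"
    if ls >= high_cut:
        return "unhealthy"
    return None


def bmi_group(bmi):
    """Return the BMI stratum used for the discovery cohort, else None."""
    if bmi < 25:
        return "without_obesity"
    if bmi > 30:
        return "with_obesity"
    return None  # BMI between 25 and 30 kg/m2 is not part of either stratum


def stratify(participants):
    """Group participant ids by (lifestyle group, BMI group)."""
    strata = {}
    for pid, ls, bmi in participants:
        key = (lifestyle_group(ls), bmi_group(bmi))
        if None not in key:
            strata.setdefault(key, []).append(pid)
    return strata


# Hypothetical records: (id, lifestyle score, BMI)
participants = [
    ("p1", 8, 22.4), ("p2", 10, 33.1), ("p3", 27, 24.0),
    ("p4", 52, 21.9), ("p5", 60, 35.7), ("p6", 49, 27.5),
]
strata = stratify(participants)
```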

Sample preparation
All samples were isolated, stored and maintained at the Leipzig Medical Biobank 15 according to standard protocols. Briefly, blood samples were taken after an overnight fast (mean fasting duration 12.7 ± 1.7 h) during the individuals' study visit and stored at 4°C-8°C until DNA isolation (within 48 h after blood withdrawal) on the Autopure LS platform (Qiagen, Germany) using chemistry by Qiagen and Stratec Molecular (Stratec, Germany). Genomic DNA samples were stored at -80°C prior to integrity control using gel electrophoresis and concentration measurements of double-stranded DNA using Quant-iT PicoGreen dsDNA (Invitrogen, ThermoFisher Scientific, Germany) and Quantus (Promega, Germany) technologies.

Genome-wide DNA methylation analysis
Five hundred nanograms of genomic DNA was taken for bisulphite conversion using an EZ DNA Methylation Gold Kit (Zymo Research, Netherlands). After quality control (QC), amplification and hybridisation on Illumina HumanMethylation850 Bead Chips (Illumina, Inc., San Diego, CA, USA), the Illumina iScan array scanner was used to quantify genome-wide DNA methylation levels at 850K CpG sites per sample at single-nucleotide resolution.
Raw data were first quality controlled using the QC report of the minfi R package (version 1.38.0). 16,17 Two samples that did not pass the badSampleCutoff of 10.5 were excluded during normalisation steps. Beta densities and control probes were within predicted specifications. Probes that did not pass the detection p-value threshold (p = .01) in more than 1% of all 98 samples were excluded from the analysis (17 375 probes). Cross-reactive probes (38 924 probes) 18 and probes containing known single-nucleotide polymorphisms (SNPs) (29 383 probes) at the CpG site (DNA region where a cytosine nucleotide is followed by a guanine nucleotide) were also filtered out by applying the maxprobes (version 0.0.2) and minfi R packages, respectively. In addition, probes on sex chromosomes were removed from the analysis subset (19 627 probes), as sex represents a major source of variation in our methylation data. In total, 760 550 probes remained for the analysis. β-Value generation and quantile normalisation were computed using the minfi R package 16,17 and adjusted for sex-specific batch effects (see Figure S1A). Furthermore, we analysed the cell type composition using the Houseman approach 19-21 adapted to EPIC arrays by Salas et al. 21 Possible differences in cell type composition were analysed using Wilcoxon tests in R (see Figure S2). We corrected β-values for cell type composition in an attempt to reduce noise, 21 although none of the cell type populations differed strongly between the subgroups (low and high LS) (Figure S2A) or between the low and high LS subgroups in individuals without and with obesity (Figure S2B). Differential methylation analyses were performed between subjects with extremely healthy and unhealthy lifestyles (low vs. high LS/5th vs. 95th percentile) as well as between participants without (BMI < 25 kg/m2) and with obesity (BMI > 30 kg/m2) within each lifestyle subgroup.
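The detection p-value filter described above can be sketched as follows. This is only an illustration of the rule "fails in more than 1% of samples", not the minfi implementation; the probe identifiers and p-values are hypothetical, and treating p > .01 (rather than p ≥ .01) as a failed detection is an assumption.

```python
def passes_detection_filter(pvals, p_cut=0.01, max_fail_frac=0.01):
    """True if the probe fails detection in at most max_fail_frac of samples."""
    n_failed = sum(1 for p in pvals if p > p_cut)  # assumption: strict '>'
    return n_failed / len(pvals) <= max_fail_frac


def filter_probes(detection_p):
    """detection_p: dict mapping probe id -> list of per-sample p-values."""
    return sorted(
        probe for probe, pvals in detection_p.items()
        if passes_detection_filter(pvals)
    )


# Hypothetical detection p-values for three probes across 98 samples
detection_p = {
    "cg_probe_1": [0.0001] * 98,               # detected in every sample
    "cg_probe_2": [0.0001] * 97 + [0.2],       # 1/98 failed (about 1.02%)
    "cg_probe_3": [0.0001] * 90 + [0.5] * 8,   # 8/98 failed
}
kept = filter_probes(detection_p)
```

With 98 samples, a single failed sample already exceeds the 1% limit (1/98 ≈ 1.02%), so under these assumptions only probes detected in every sample are retained.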
The established R package limma (version 3.48.3) was used to identify differentially methylated CpG sites. 22 Data assignment to technical and biological information by principal component analysis using the R package SWAMP (version 1.5.1) 23 showed that array slides were the primary batch effect, for which we adjusted accordingly (Figure S3). Differentially methylated positions (DMPs) describe differences in methylation levels of single CpG positions with an adj. p-value <.05. Differentially methylated regions (DMRs) were extracted by applying DMRcate (version 2.6.0), 24 which uses Gaussian kernel smoothing to find patterns of differential methylation independent of genomic annotation. Only DMRs with more than two CpG sites were reported. DMRs with a minimum smoothed FDR <5% were defined as differentially methylated. DMRs with a mean methylation difference >±2% were further annotated to CpG islands (CGIs), CpG shores, CpG shelves and inter-CGI regions as well as to gene context-related regions (promoters, 5ʹ untranslated regions [UTRs], exons, introns, 3ʹUTRs and intergenic regions). Genomic annotation was performed using the annotatr R package (version 1.18.1) 25 with respect to multiple annotations. To elucidate putative drivers of blood DNA methylation, separate analyses with individual covariates (smoking, diet, PA, alcohol, BMI and age) were performed. Intersection analysis for covariate-specific effects on lifestyle DMRs was performed using the UpSetR (version 1.4.0) package. 26 EPIC raw data are available at the Leipzig Health Atlas under https://www.health-atlas.de/studies/57.
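The three DMR reporting criteria stated above (more than two CpG sites, minimum smoothed FDR <5%, and an absolute mean methylation difference >2% for annotation) can be expressed as a simple filter. The sketch below is not DMRcate output handling; the record fields and the example regions are hypothetical, with mean differences given as fractions.

```python
def is_significant_dmr(dmr, fdr_cut=0.05, min_cpgs=3):
    """More than two CpG sites and minimum smoothed FDR below the cut-off."""
    return dmr["n_cpgs"] >= min_cpgs and dmr["min_smoothed_fdr"] < fdr_cut


def is_annotation_candidate(dmr, diff_cut=0.02):
    """Significant DMR whose absolute mean difference exceeds 2%."""
    return is_significant_dmr(dmr) and abs(dmr["mean_diff"]) > diff_cut


# Hypothetical DMR records
dmrs = [
    {"id": "dmr_a", "n_cpgs": 7, "min_smoothed_fdr": 1e-6, "mean_diff": -0.069},
    {"id": "dmr_b", "n_cpgs": 2, "min_smoothed_fdr": 1e-4, "mean_diff": 0.040},
    {"id": "dmr_c", "n_cpgs": 5, "min_smoothed_fdr": 0.03, "mean_diff": 0.011},
    {"id": "dmr_d", "n_cpgs": 4, "min_smoothed_fdr": 0.20, "mean_diff": 0.054},
]

significant = [d["id"] for d in dmrs if is_significant_dmr(d)]
candidates = [d["id"] for d in dmrs if is_annotation_candidate(d)]
# Direction in healthy relative to unhealthy lifestyle
direction = {
    d["id"]: ("hyper" if d["mean_diff"] > 0 else "hypo")
    for d in dmrs if is_annotation_candidate(d)
}
```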

Methylation age and telomere length clocks
DNA methylation age (DNAmAge), corresponding DNAmAge acceleration differences according to Horvath's clock (I and II), Levine's clock and the telomere length clock were estimated using the R package methylclock (version 0.7.5). 27
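For orientation, the sketch below computes an age acceleration measure under the assumption that "acceleration difference" means the simple difference between estimated DNAmAge and chronological age. It does not reproduce the clock estimation itself, and all values and names are hypothetical.

```python
from statistics import mean


def age_acceleration(dnam_age, chron_age):
    """Per-subject difference: estimated DNAmAge minus chronological age."""
    return [m - c for m, c in zip(dnam_age, chron_age)]


def mean_acceleration_by_group(dnam_age, chron_age, groups):
    """Average the acceleration difference within each group label."""
    by_group = {}
    for acc, group in zip(age_acceleration(dnam_age, chron_age), groups):
        by_group.setdefault(group, []).append(acc)
    return {group: mean(values) for group, values in by_group.items()}


# Hypothetical estimates for four subjects
dnam_age = [52.0, 61.0, 40.0, 70.0]
chron_age = [50.0, 60.0, 42.0, 65.0]
groups = ["low_LS", "high_LS", "low_LS", "high_LS"]

summary = mean_acceleration_by_group(dnam_age, chron_age, groups)
```

Positive values indicate that the estimated methylation age runs ahead of chronological age.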

KEGG pathway overrepresentation
Candidate genes identified by significant DMRs (minimum smoothed FDR <5%) characterising lifestyle-specific methylation differences (healthy vs. unhealthy living subjects) and differences between subjects without and with obesity were taken forward for a KEGG pathway overrepresentation test using clusterProfiler::enrichKEGG (version 3.18.1). 28 Enrichment p-values were adjusted using Benjamini-Hochberg correction, and FDR <5% was considered statistically significant.
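The logic of an overrepresentation test with Benjamini-Hochberg adjustment can be sketched in a few lines. This is a generic illustration of the hypergeometric upper-tail test and the BH step-up procedure, not the clusterProfiler code; the gene counts and pathway names are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k, n_universe, n_pathway, n_selected):
    """P(X >= k) when n_selected genes are drawn from n_universe genes,
    of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_selected)
    total = comb(n_universe, n_selected)
    hits = sum(
        comb(n_pathway, i) * comb(n_universe - n_pathway, n_selected - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def benjamini_hochberg(pvals):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvals[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: (genes in pathway, overlap with candidate genes)
universe, selected = 8000, 400
pathways = {"pathway_A": (100, 15), "pathway_B": (150, 8), "pathway_C": (60, 3)}
raw = {
    name: hypergeom_upper_tail(overlap, universe, size, selected)
    for name, (size, overlap) in pathways.items()
}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
enriched = [name for name, p in adj.items() if p < 0.05]
```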

Validation of candidate CpGs
We selected two top candidate DMPs (Tables 2 and S8) from our discovery cohort (high LS vs. low LS) for additional validation using bisulphite sequencing. Briefly, 300 ng of genomic DNA was bisulphite converted using the EpiTect Fast DNA Bisulfite Kit (Qiagen). After a whole genomic amplification using the EpiTect Whole Bisulfitome Kit (Qiagen), candidate regions were amplified and sequenced using the PyroMark Q24 platform and self-designed assays for retinoic acid receptor alpha (RARA) and F2R like thrombin or trypsin receptor 3 (F2RL3) candidate DMPs (Qiagen). Primer sequences are shown in Table S3. All analyses were performed in duplicate, including two non-template controls per sequence run.

Transcriptome data
Transcriptome data were available from Illumina HT-12 v4 Expression Bead Chips (Illumina) using whole blood RNA samples from the LIFE-Adult cohort as described elsewhere. 8,29 Data processing was performed using R/Bioconductor after extraction of all 47 231 gene-expression probes using Illumina GenomeStudio without background correction. Furthermore, expression values were log2 transformed and quantile normalised, 30,31 and batch effect correction was performed using an empirical Bayes method. 32 Probes were excluded when expressed in less than 5% of the (subgroup-specific) samples (detected by Illumina GenomeStudio), still being associated with batch effects after Bonferroni correction or not mapping to a gene accordingly 33 (accessed on 4 April 2019). Additionally, gene probes without available annotation and genes on the X and Y chromosomes were removed to avoid effects introduced by sex. In summary, 20 114 valid gene-expression probes were identified corresponding to 14 687 single genes in the human genome (hg19). A three-step procedure was used to remove poor quality samples: (1) first, the number of detected gene-expression probes of a sample was required to be within ±3 interquartile ranges (IQR) from the median, (2) the Mahalanobis distance of several quality characteristics of each sample (signal of Ambion™ ERCC spike-in control probes, signal of biotin-control probes, signal of low-concentration control probes, signal of medium-concentration control probes, signal of mismatch control probes, signal of negative control probes and signal of perfect-match control probes) 34

Genotype data
In total, 7838 LIFE-Adult participants were genotyped using the genome-wide SNP array Affymetrix Axiom CEU1 and the software Affymetrix Power Tools (version 1.20.6). QC of the genotyped data was performed following Affymetrix's data analysis guide 35 as previously described. 36 QC according to Affymetrix's data analysis guide included dish-QC (<0.82), sample call rate (<97%), sex mismatch, ambiguous relatedness (e.g., sample mix-up) and abnormalities of XY intensity plots (e.g., XXY samples filtered for gonosomal analyses). Genetic heterogeneity was evaluated with principal component analyses, and outliers (>6 SD in any of the first 10 principal components) were removed. The criteria call rate, parameters of cluster plot irregularities according to Affymetrix's recommendations, violation of Hardy-Weinberg equilibrium (p-value < 10⁻⁶) in an exact test for autosomes (p-value < 10⁻⁴), for chromosome X with females only 37 were imputed on the reference 1000 Genomes Phase 3, 38 applying SHAPEIT 39 v2r900 (prephasing) and IMPUTE2 (version 2.3.2) 39,40 for genotype estimation. A specific genotype was assigned to a SNP if its corresponding genotype estimation featured a probability of ≥.8. 41 In 2.5% of the cases, none of the genotypes exceeded that threshold, and the respective SNP was labelled 'missing' for that sample. SNPs whose 'missing' count over all samples exceeded the upper quartile +1.5 × IQR were removed, resulting in a total of 2830 SNPs.
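The hard-call rule and the missingness filter described at the end of this paragraph can be sketched as follows. The SNP identifiers and counts are hypothetical, and the quartile definition (Python's "inclusive" method) is an assumption, since the text does not state which quantile estimator was used.

```python
from statistics import quantiles


def call_genotype(probs, threshold=0.8):
    """Return the index of the most probable genotype (0, 1 or 2 copies of
    the coded allele) if its probability reaches the threshold, else None."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return best if probs[best] >= threshold else None


def snps_to_keep(missing_counts):
    """Drop SNPs whose 'missing' count exceeds Q3 + 1.5 x IQR."""
    q1, _, q3 = quantiles(list(missing_counts.values()), n=4, method="inclusive")
    upper_fence = q3 + 1.5 * (q3 - q1)
    return sorted(snp for snp, n in missing_counts.items() if n <= upper_fence)


# Hypothetical imputation output for one SNP in three samples
calls = [call_genotype(p) for p in [(0.95, 0.04, 0.01),
                                    (0.10, 0.85, 0.05),
                                    (0.50, 0.45, 0.05)]]

# Hypothetical 'missing' counts per SNP across all samples
missing_counts = {"snp_a": 0, "snp_b": 0, "snp_c": 1, "snp_d": 1,
                  "snp_e": 1, "snp_f": 2, "snp_g": 2, "snp_h": 50}
kept = snps_to_keep(missing_counts)
```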

matrixEQTL analysis
Among samples with significant DMP data (from DMRs healthy vs. unhealthy), additional gene-expression and SNP data were available for 48 samples. The R package matrixEQTL (version 2.3) 42 was employed on all three pairs of datasets to identify cis effects (within a range of ±1 kb) between methylation and expression (eQTMs), methylation and SNPs (meQTLs) and expression and SNPs (eQTLs). All three comparisons were performed on the complete data (N = 48) and on both subgroups with high LS (N = 23) and low LS (N = 25) separately. Since sex batch effects had been adjusted for in both expression and methylation data, small batch effects remained only for age and BMI. However, including both age and BMI as covariates in the matrixEQTL analysis did not change the overall result, which is why the final matrixEQTL analysis was run without considering any covariates.
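The cis restriction (±1 kb) amounts to testing only those feature pairs that lie on the same chromosome within the distance window. The sketch below shows that pairing step alone, assuming single-base positions for every feature; matrixEQTL's own handling of gene start and end coordinates may differ, and all identifiers and positions are hypothetical.

```python
def cis_pairs(cpgs, features, max_dist=1000):
    """Pair each CpG with features on the same chromosome within max_dist bp.

    cpgs, features: lists of (name, chromosome, position) tuples.
    """
    pairs = []
    for cpg_name, cpg_chr, cpg_pos in cpgs:
        for feat_name, feat_chr, feat_pos in features:
            if cpg_chr == feat_chr and abs(cpg_pos - feat_pos) <= max_dist:
                pairs.append((cpg_name, feat_name))
    return pairs


# Hypothetical CpG and SNP coordinates
cpgs = [("cg_1", "chr15", 90_350_000), ("cg_2", "chr19", 17_000_500)]
snps = [
    ("snp_1", "chr15", 90_350_800),   # 800 bp from cg_1: cis
    ("snp_2", "chr15", 90_352_000),   # 2 kb from cg_1: outside the window
    ("snp_3", "chr19", 17_000_000),   # 500 bp from cg_2: cis
    ("snp_4", "chr6", 90_350_100),    # close position but other chromosome
]
tested = cis_pairs(cpgs, snps)
```

A narrow window such as this keeps the number of tests small, which matters with only 48 samples, but it also means that few variants are available per target region.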

Statistics
All statistical analyses were performed using R software (version 4.0.4). 43 After checking for normal distribution, Mann-Whitney U-test or Welch's t-test was applied to test for differences between the 5th and 95th percentiles as well as between lean and obese subgroups for the following phenotypes: BMI, age, HbA1c, waist-to-hip ratio (WHR), fasting plasma glucose and insulin, low-density lipoprotein (LDL), high-density lipoprotein (HDL), apolipoprotein A1 (Apo A1) and triglyceride serum levels. Welch's t-test was used to compare methylation differences measured as normalised β-values between low LS versus high LS for each top DMP, respectively. After using the Shapiro-Wilk test to assess the normality of the bisulphite sequencing data, an independent Mann-Whitney U-test was applied to compare methylation differences within the validation cohort. Methylation levels between BMI categories were compared by applying two-way analysis of variance (ANOVA). Correlation analysis was performed using Spearman's correlation. All respective analyses were adequately corrected for multiple testing.
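To make the group comparison concrete, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom for two groups with unequal variances. It is an illustration only (the p-value lookup from the t distribution is omitted), and the β-values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    # Squared standard errors of the two group means
    se2_a = variance(a) / len(a)
    se2_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (len(a) - 1) + se2_b ** 2 / (len(b) - 1)
    )
    return t, df


# Hypothetical normalised beta-values at one CpG
low_ls = [0.82, 0.80, 0.79, 0.83, 0.81]    # healthy lifestyle
high_ls = [0.70, 0.64, 0.69, 0.62, 0.71]   # unhealthy lifestyle
t_stat, dof = welch_t(low_ls, high_ls)
```

Unlike Student's t-test, this version does not pool the variances, so the degrees of freedom shrink when the two groups differ in spread.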

Self-reported lifestyle reflects obesity-specific phenotypes
We correlated the LS with BMI and WHR in 4107 LIFE-Adult participants (mean ± SD: age = 56 ± 13 years, BMI = 27.0 ± 4.7 kg/m2, LS = 27.19 ± 11.02) (Table 1). LS was related to the obesogenic environment (Figure S4A,B) (all p-values < 1 × 10⁻³) with significantly higher values in subjects with obesity (Figure S4C). We further demonstrated that all individual scores (diet, PA, smoking, alcohol and total LS) were mutually dependent (Figure 1A), which was particularly marked in the extreme subgroups (5th and 95th percentile, Figure 1B). Finally, our score showed simultaneous negative correlations (all p-values < 1 × 10⁻¹⁵) to the protective lipid parameters HDL cholesterol and Apo A1, which were higher in healthy living subjects (Figure S5A,B).

DNA methylation signatures are related to individual's lifestyle
By comparing genome-wide blood DNA methylation patterns in subjects with healthy versus unhealthy lifestyles, we identified 4682 significant DMRs annotated to 4426 genes with a minimum smoothed FDR <5%, which included 220 DMRs with FDR <1 × 10⁻⁴ (Table S4, Figure 2A). Among the significant DMRs, the mean methylation level differences ranged from -6.9% to 5.5%.
Given the rather subtle methylation changes for the majority of the DMRs, we introduced a mean methylation difference threshold of >2% to further narrow down the potential causal candidate DMRs. Among the 340 DMRs reaching this cut-off, 164 DMRs were hypermethylated (mean methylation difference ± SD: 2.6 ± 0.6%), whereas 176 DMRs were hypomethylated (mean methylation difference ± SD: -2.8 ± 0.8%) in healthy compared to unhealthy living individuals (Figure 2A, Table S4). Taking into account that a DMR can have more than one annotation, most DMRs (46%; counts relative to the number of DMRs) were located in CpG islands, followed by 45% located in CpG shores. In relation to gene regions, most DMRs were located in introns (59%), followed by exons (39%) (Figure 2B).
The top 15 hypomethylated and hypermethylated significant DMRs according to their mean methylation difference are presented in Table 3 with a DMR annotated to the glutamine-fructose-6-phosphate transaminase 2 (GFPT2) gene locus showing the strongest hypomethylation (mean methylation difference DMR = -6.9%). A DMR annotated to the glutamate rich 1 (ERICH1) gene showed the strongest hypermethylation (mean methylation difference DMR = 5.4%). Finally, using KEGG pathway analyses, we identified glutamatergic synapses as the most enriched pathway (adj. p-value < .01) followed by axon guidance, another brain-related pathway (adj. p-value < .05). Most of the nine enriched pathways (Table S5, Figure 2C) are related to various cancer types.
As demonstrated by the intersection plot (Figure 2D, Table S6), the majority of the DMRs (N = 1952) were driven by all four lifestyle subscores together (diet, PA, smoking and alcohol), followed by a combination of them together with BMI and age (N = 743). Notably, BMI and age alone did not explain any of the identified DMRs. Although this did not indicate a prominent role of smoking (Figure 2D), given the nature of the LS, a comparison between participants with very healthy and very unhealthy lifestyles mirrors differences between nonsmokers and smokers. We therefore further adjusted the complete analyses for smoking as a covariate, which resulted in 629 identified DMRs with a minimum smoothed FDR <5%. Among them, the most significant DMR is located within a CpG island of the ring finger protein 39 (RNF39) locus (Table S7).

Obesity-specific methylation marks
Given the comparable distribution of subjects with and without obesity between the very healthy and unhealthy lifestyle groups, we aimed to identify lifestyle-independent obesity-related methylation marks. Therefore, the blood methylation patterns of subjects with obesity (N = 25) were compared with those of subjects without obesity (N = 25) within each lifestyle group separately. Interestingly, whereas 1572 DMRs annotated to 1599 different genes were identified in subjects living a healthy lifestyle, only 85 DMRs annotated to 101 genes were detected in subjects living an unhealthy lifestyle (Tables S10 and S11) with a minimum smoothed FDR <5%. This further included 10 identical annotations among the PAX6 and HOXA9-10 gene clusters, already known candidates regarding obesity and related comorbidities. However, at the CpG level, no DMPs were sustained after correction for multiple testing (data not shown). Nevertheless, KEGG pathway analysis for the healthy subgroup indicated eight enriched pathways, among them GABAergic synapse, dilated cardiomyopathy and calcium signalling (Figure 4, Table S12), whereas for the unhealthy subgroup, only antigen processing and presentation was enriched (not shown).

Methylation age
We observed the strongest association (p-value < 1 × 10⁻¹⁰, R² = 0.37, Figure 3E) between mAge and subjects' chronological age within the discovery cohort for Horvath's clock II, which, compared to Horvath's clock I, was additionally trained on 850K EPIC arrays (Figure S6A-C).
Only marginal (p-value = .01) differences in DNAmAge acceleration were observed when comparing individuals with healthy versus unhealthy lifestyles, which was similar to comparing never smokers with previous or current smokers (Figure S6D,F). No difference was observed between subjects with and without obesity (Figure S6E). We further observed a strong negative association between the telomere length clock and chronological age (p-value < 1 × 10⁻⁸, R² = -0.32, Figure 3E). Interestingly, both clocks showed an additional linear association with WHR within our discovery cohort (all p-values < 1 × 10⁻⁴, Figure 3E).

Underlying genetic predispositions and effects on mRNA levels in blood
Owing to the small overlapping sample size (N = 48) and only marginal genetic variation in close proximity (±1 kb) to the identified target DMRs (healthy vs. unhealthy lifestyle), we could not identify any meQTLs or eQTLs. However, we found associations between the methylation levels of eight DMPs and target mRNA expression levels (Table S13) in the combined discovery group (all individuals with healthy and unhealthy lifestyles). Among them, only two eQTMs annotated to the alanyl aminopeptidase (ANPEP) locus were sustained after correction for multiple testing (matrix FDR = 0.03; Table S13, Figure S7). Four eQTMs were detected in subjects with a healthy lifestyle and eight in subjects with an unhealthy lifestyle; among them, our candidate DMP of F2RL3 was also detected in healthy subjects; however, none was sustained after correction for multiple testing.

Figure 3 (caption, partially recovered): p-Values indicate statistically significant differences detected using Welch's t-test. (C and D) Box plots are given as the mean methylation ± SD, and the 95% confidence interval is represented by notches for the two validated DMPs (C) RARA and (D) F2RL3 and their surrounding CpGs. p-Values indicate statistical significance between healthy (low LS) and unhealthy (high LS) subjects detected using analysis of variance (ANOVA). p-Values are indicated as *p < .05, **p < .01 and ***p < .001. (E) Linear regression analysis between methyl age (methAge) for the Horvath II, telomere length, chronological age and waist-to-hip ratio (WHR) measurements presented as a scatter plot. The light grey area represents the 95% confidence interval, and R² represents the coefficient of determination.

DISCUSSION
Epigenetic markers are known to reflect environmental conditions and thereby are affected not only by genetic predisposition but also most strongly by our daily lifestyle. Although this is widely acknowledged by the scientific community, the majority of epigenetic studies in regard to obesity, most of them conducted cross-sectionally, still lack the inclusion of relevant lifestyle drivers. 48 Therefore, to the best of our knowledge, this is one of the few studies investigating the potential effects of lifestyle on the respective blood DNA methylation signatures. 49 Here, we calculated the LS based on each individual's diet, PA, smoking and alcohol consumption within the LIFE-Adult study from Germany. Genome-wide DNA methylation analysis in blood samples of 100 subjects representing healthy and unhealthy lifestyle extremes demonstrated that daily lifestyle is most likely more strongly associated with blood DNA methylation patterns than the obesity state itself, as supported by association studies between neonatal blood methylation and the risk of developing obesity later in life. 50 The study showed that the distribution of obesity categories in extreme lifestyle groups was comparable and that potentially obesity-associated methylation marks were more frequent in subjects with healthy lifestyles. However, this could also be driven by the general exclusion of subjects suffering from diabetes, which may have inadvertently excluded subjects with metabolically unhealthy obesity. Furthermore, mAge and estimated telomere length showed strong correlations with chronological age and WHR, with observed smaller DNAmAge acceleration distances in healthy subjects. Finally, two DMPs for ANPEP also showed the strongest eQTM in blood within the subgroup of 48 subjects. With this study, we took several lifestyle aspects into account to explore relations between long-term lifestyle habits and differences in human blood methylation patterns.
Our findings imply that dietary habits, PA, smoking habits and alcohol consumption influence epigenetic patterns together, whereas only negligible effects are attributed to age and BMI alone. This suggests that rather than simply representing the consequence of obesity, differences in blood-derived methylation marks may be primarily driven by long-term lifestyle habits. This is further supported by the observed smaller DNAmAge acceleration in the healthy lifestyle group compared to the unhealthy lifestyle group, whereas no significant difference could be observed between subjects with and without obesity.
We identified several candidate genes differentially methylated according to the LS and successfully validated RARA and F2RL3, already known from previous studies to be influenced by lifestyle aspects and acknowledged for their role in metabolic diseases. 45,47,51 Both genes were hypermethylated in extremely healthy compared to unhealthy living individuals, which is in line with previously published data. 41,45,52 In particular, hypomethylation of the DMP within the F2RL3 locus appears to increase the risk for cardiovascular as well as overall mortality. 45,52 Translated to our results, this might indicate an increasing mortality risk of an unhealthy lifestyle accompanied by associated diseases, such as obesity, type 2 diabetes, cardiovascular diseases or cancer. Previous studies further showed a hypomethylating effect of smoking on the F2RL3 DMP identified here. 51 Moreover, very recently, a strong association with coffee consumption in a large-scale epigenome-wide association study (EWAS) was reported. 53 Consistent with published data on smoking, subjects with an unhealthy lifestyle in our study showed a mean methylation of 67% compared to 81% in the healthy lifestyle group, with the majority of subjects within the unhealthy group being current smokers (validation cohort). In line with this, we further observed a marginal (p-value = .04) positive correlation between methylation of this DMP and F2RL3 mRNA levels in the healthy lifestyle subgroup.
We found significant methylation differences between the healthy and unhealthy lifestyles for RARA, which is known for its role in adipogenesis. 54 It is noteworthy, however, that based on the findings of the present study, an increase in the RARA methylation pattern might be related to higher HDL cholesterol and lower triglyceride serum levels, indicating a link between RARA and lipid metabolism.
Although the observed differences in DNA methylation in RARA could also be driven by smoking as previously described 46 and supported by the strong correlation with smoking found here, there is still a prominent influence of other environmental conditions, such as diet and PA, as shown by our present data. Nevertheless, it needs to be acknowledged that in line with our study, the majority of methylation studies on smoking, although lacking any information on diet or activity, identified a similar set of top candidates, especially F2RL3, RARA and AHRR, in human blood cells. 41,45,47,52 It is also worth mentioning that smoking effects on F2RL3 methylation were previously also observed in adipose tissue. 47 Since we included smoking as a lifestyle factor, it is possible that some of the identified genes are indeed related to lung cancer. 55,56 Consequently, narrowing down our list of potential lifestyle discriminating candidate regions by including an additional adjustment for smoking resulted in the identification of a top DMR on chromosome 6 annotated to RNF39. This DMR overlapped with a larger region very recently described to successfully discriminate responders from nonresponders to a lifestyle intervention based on either a Mediterranean/low-carbohydrate or low-fat diet with or without PA. 7
There are a few key limitations to our study. First, we used a scoring system based on self-reported questionnaires, which might yield overly favourable responses, including over- or underestimation of the real status. 57 However, our study design is supported and strengthened by findings that are in line with previously reported data, for example, on lifestyle factors such as smoking. 45,47 Furthermore, although we excluded individuals suffering from diabetes, we cannot fully exclude effects driven by other non-diabetes-related medications.
Finally, the observational and cross-sectional nature of the study does not allow testing the direction of causality at least between methylation and metabolic phenotypes and limits our ability to rule out confounding (e.g., sex), even though it seems unlikely that methylation marks affect lifestyle habits.
Although the identification of reliable and reproducible epigenetic marks for obesity in human blood remains challenging, our study clearly indicates the importance of considering as many lifestyle aspects as possible when analysing epigenetic data with regard to complex diseases such as obesity. We successfully demonstrated that the majority of CpG methylation marks are much more strongly influenced by our daily lifestyle than the obesity state itself.

ACKNOWLEDGEMENTS
We thank all study participants of the LIFE-Adult study whose personal dedication and commitment have made this project possible. We would like to acknowledge the excellent technical assistance of Beate Gutsmann and Ines Müller. This work has been supported by a young investigator research fund from the Medical Faculty of the University Leipzig, the German Diabetes Association, the Free State of Saxony, Deutsches Zentrum für Diabetesforschung and grants from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation - Projektnummer 209933838 - SFB 1052; B03, B08, C01, Z04; SPP 1629 TO 718/2-1). LIFE-Adult is funded by the Leipzig Research Center for Civilisation Diseases (LIFE). LIFE is an organisational unit affiliated with the Medical Faculty of the University of Leipzig. LIFE is funded by means of the European Union, by the European Regional Development Fund (ERDF) and by funds of the Free State of Saxony within the framework of the excellence initiative.

CONSENT FOR PUBLICATION
Not applicable.